Churn and Default

Introduction

to-do: add intro

Literature Review

to-do: to be added/folded into the Introduction to frame the macro-scope of this study

note

Analysis is awaiting Experian appending.

This analysis summaries the Nautilus customer data from the Frog Hollow and Howell community solar farm participation from January 2020 to April 2022. The goal of the analysis is to first describe the characteristics of the residents, the prevalence of default and/or churn rates, the prevalence of churn rates, and any statistically significant differences between groups in their default, late payments or churn rates.

Methods

The data is comprised of monthly payment performance and, where available, demographic data for residents. 32,385 monthly observations over 813 homes, 812 accounts and 621 users were initially observed. To avoid data quality issues and to analyze any potential churn and default trends, for Frog Hollow households the analysis will drop values in October 2021 until April 2022. This narrowed dataset is comprised of 31,704 observations; the homes, accounts and users did not change.

The first ten rows of the data are presented, grouped by account number. Unique identifiers were replaced with sequential IDs.

  • Tenure: length in months each account was active
  • Payment_Method: either card or ACH/direct debit method
  • temp_solar: annual (right?) solar allocation in kWh
  • Churn: binary _var_iable capturing if an account left the solar farm
  • defauted: binary variable capturing if account holder defaulted on payment
  • Income: Low (<$50,000) or High (>$50,000)
  • Solar_Farm: Either Howell or Frog Hollow
First 10 Home Rows
Account_ID tenure payment_method temp_solar Churn solar_farm defaulted count_user_id Income
1 17 card 25.68 0 Howell (O&R) 0 1 NA
2 22 card 6.65 0 Howell (O&R) 0 1 NA
3 18 card 0.96 1 Howell (O&R) 0 1 NA
4 24 card 83.40 0 Howell (O&R) 0 1 NA
5 25 card 6.05 0 Howell (O&R) 0 1 High
6 23 ach 14.54 0 Howell (O&R) 0 1 NA
7 23 ach 30.65 0 Howell (O&R) 0 1 NA
8 3 NA 24.87 0 Frog Hollow (CHGE) 0 1 High
9 25 card 7.92 0 Howell (O&R) 0 1 NA
10 23 card 11.20 0 Howell (O&R) 0 1 High

Descriptive Statistics

Total

The summary statistics below provide additional information. The average tenure of the 812 accounts was 20.3 months. The churn rate was 12.4%. The sample does skew towards high income homes, however note a large number of missing values were observed in income (567, 70%).

Descriptive Statistics
Total Mean Max Min
Tenure 16486.00 20.30 25.0 1
Defaults 6.00 0.01 1.0 0
Churn 118.00 0.15 1.0 0
Low Income 21.00 0.09 NA NA
High Income 224.00 0.91 NA NA
Solar 6294.78 7.75 83.4 0
Payment by Card 565.00 0.70 NA NA
Payment by ACH 239.00 0.30 NA NA

Frog Hollow

Frog Hollow Descriptive Statistics
Total Mean Max Min
Tenure 8799.00 19.38 22.00 2
Defaults 1.00 0.00 1.00 0
Churn 59.00 0.13 1.00 0
Low Income 9.00 0.21 NA NA
High Income 33.00 0.79 NA NA
Solar 3269.62 7.20 48.14 0
Payment by Card 322.00 0.72 NA NA
Payment by ACH 125.00 0.28 NA NA

Howell

Howell Descriptive Statistics
Total Mean Max Min
Tenure 7687.00 21.47 25.0 1
Defaults 5.00 0.01 1.0 0
Churn 59.00 0.16 1.0 0
Low Income 12.00 0.06 NA NA
High Income 191.00 0.94 NA NA
Solar 3025.16 8.45 83.4 0
Payment by Card 243.00 0.68 NA NA
Payment by ACH 114.00 0.32 NA NA

Tenure Length

Obersvations are next grouped by tenure length.

Account Tenure Length Statistics
Number of homes Churned Defaulted Average Solar Allocation Payment by Card Payment by ACH
≤ 5 29 12 0 5.01 18 3
5 to 10 42 24 0 7.27 35 7
10 to 15 23 16 3 7.66 19 4
15 to 20 718 66 3 7.89 493 225
Total 812 118 6 7.75 565 239

For homes that churned, the tenure was reasonbly much shorter than those that did not churn. The below graph shows the average tenure between these two groups, showing that it is almost double for those homes that exhibit continuous subscription.

Income

Resident Income Statistics
Income Number of accounts Churned Defaulted Average Solar Allocation Payment by Card Payment by ACH
High 224 54 3 6.56 183 37
Low 21 9 1 4.54 16 4
NA 567 55 2 8.34 366 198
Total 812 118 6 7.75 565 239

Churn Reasons

When a respondent left the program, a reason was recorded, when available. The below summarises the accounts that experienced a churn, which may further be collapsed into categories. Note the total below may not match the churn totals in the above descriptive statistics due to defaulted payments and duplicates included.

temp <- raw %>%
 group_by(user_id,utility_acct_number) %>%
  summarise(tenure = length(unique(date_concat)),
            count_accts = n(),
            default = sum(default_tag=="Defaulted Payment"),
           # count_accts = length(unique(utility_acct_number)) ,
            count_churn = sum(default_tag !=0 & default_tag !="Defaulted Payment")/length(unique(date_concat))) #%>%
## `summarise()` has grouped output by 'user_id'. You can override using the
## `.groups` argument.
 # filter(count_accts > 1) 

temp
## # A tibble: 812 × 6
## # Groups:   user_id [620]
##    user_id utility_acct_number tenure count_accts default count_churn
##      <dbl> <chr>                <int>       <int>   <int>       <dbl>
##  1      17 1769073011              23          23       0        0   
##  2      19 7989248005               3           3       0        1   
##  3      31 689005016               21          44       0        2.10
##  4      32 21002383038             22          23       0        0   
##  5      33 2155441000              15          30       0        2   
##  6      33 2536130080              17          34       0        2   
##  7      37 21003282874             22          46       0        0   
##  8      38 21001139837             22          69       0        0   
##  9      39 7376061001              22          44       0        0   
## 10      42 4381440011              15          15       0        1   
## # … with 802 more rows

Results

Churn Figures

Solar

Tenure & Churn by Payment Method

Default Figures

Solar

Tenure & Default by Payment Method

Model

A logit model is deployed to analyze the probability of a home churning. The model in brief is described below. Without additional data collected from Experian, limited variability can be measured due to NA values in addition to limited number of coefficients to measure.

In the limited data available, statistically significant effects were measured for probability of churn for both tenure (Months) and days late. Due to relatively small number of observations of late payments, no statistically significant results were measured in probability of late or default payments.

\[\begin{equation} P_{d} = \beta_{0} + \beta_{1}*Months +\beta_{2}*(Payment Method = Card) +\beta_{3}*Income=Low \\ +\beta_{4}*Solar kwH + \beta{5}*(Solar Farm=Howell) \end{equation}\]

## 
## Call:
## glm(formula = Churn ~ tenure + payment_method + Income + temp_solar + 
##     solar_farm, family = binomial(link = "logit"), data = joined)
## 
## Deviance Residuals: 
##     Min       1Q   Median       3Q      Max  
## -2.8845  -0.4531  -0.4198   0.2479   2.2907  
## 
## Coefficients:
##                        Estimate Std. Error z value       Pr(>|z|)    
## (Intercept)             5.26675    1.03903   5.069 0.000000400133 ***
## tenure                 -0.26409    0.04308  -6.131 0.000000000876 ***
## payment_methodcard     -0.24492    0.51325  -0.477          0.633    
## IncomeLow               0.55029    0.63145   0.871          0.383    
## temp_solar             -0.01833    0.04537  -0.404          0.686    
## solar_farmHowell (O&R) -1.15128    0.50626  -2.274          0.023 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 276.31  on 239  degrees of freedom
## Residual deviance: 176.59  on 234  degrees of freedom
##   (572 observations deleted due to missingness)
## AIC: 188.59
## 
## Number of Fisher Scoring iterations: 5

The logit curves below show the relationship between the distribution of both tenure and days late with the probability of churning. The first graph shows that the longer the tenure, the lower the likelihood. The second graph shows that the days of late payments are generally not associated with likelihood of churn.

Discussion